17 research outputs found

    A Background-Centric Approach for Moving Object Detection with a Moving Camera

    Doctoral dissertation (Ph.D.), Department of Electrical and Computer Engineering, Graduate School of Seoul National University, February 2017. Advisor: 최진영.

    Many surveillance cameras have been installed for safety and security in real environments. To achieve human-level visual intelligence through cameras, much effort has been devoted to developing computer vision algorithms that realize visual functions from low level to high level. Among them, moving object detection is a fundamental function, because attention to a moving object is essential for understanding its high-level behavior. Most moving object detection algorithms for a fixed camera adopt a background-centric modeling approach. However, the background-centric approach does not work well with a moving camera, because modeling the moving background online is challenging. Until now, most algorithms for object detection in a moving camera have instead relied on the object-centric approach with appearance-based recognition schemes, which suffers from heavy computational complexity. In this thesis, we propose an efficient and robust scheme based on the background-centric approach to detect moving objects in dynamic background environments using moving cameras.

    To tackle the challenges arising from the dynamic background, we address four problems: false positives from inaccurate camera motion estimation, sudden scene changes such as illumination, objects moving slowly relative to the camera motion, and the limitation of the motion model in dashcam video. To remove false positives due to motion estimation error, we propose a new scheme that improves the robustness of moving object detection in a moving camera. To lessen the influence of background motion, we adopt a dual-mode kernel model that builds two background models using grid-based modeling. In addition, to reduce false detections and missed true objects, we introduce an attentional sampling scheme based on the spatio-temporal properties of moving objects: from these properties we build a foreground probability map and generate a sampling map that selects candidate pixels for finding the actual objects. Background subtraction and model updates are applied only to the selected pixels.

    To resolve the sudden-scene-change and slow-moving-object problems, we propose a situation-aware background learning method that handles dynamic scenes for moving object detection in a moving camera. New modules estimate situation variables and build the background model adaptively: our method compensates for camera movement and updates the background model according to the situation variables. The situation-aware scheme enables the algorithm to build a clean background model without contamination by the foreground. To overcome the limitation of the motion model in dashcam video, we propose a prior-based attentional update scheme that handles dynamic scene changes. Motivated by the center-focused and structure-focused tendencies of human attention, we extend the compensation-based method to focus on changes near the center and neglect minor changes on the important scene structure. The center-focused tendency is implemented by increasing the learning rate of the boundary region through multiplication of the attention map and the age model. The structure-focused tendency is used to build a robust background model through model selection after the road and sky regions are estimated.
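    The grid-based, age-controlled update at the heart of the first scheme lends itself to a compact sketch. The following Python is a minimal illustration of that flavor, not the thesis implementation: the per-cell statistics, thresholds, and the hand-over rule between the apparent and candidate models are assumptions for exposition, and motion compensation (warping the model by the estimated camera motion) is taken as given.

        import numpy as np

        VAR_THRESH = 4.0   # illustrative match threshold, in multiples of the variance
        INIT_VAR = 25.0    # illustrative initial variance
        MAX_AGE = 30.0     # cap so the learning rate never reaches zero

        class DualModeGridModel:
            """Per-grid-cell mean/variance background model with an 'apparent'
            mode (0) and a 'candidate' mode (1), updated at an age-controlled rate."""

            def __init__(self, h_cells, w_cells):
                shape = (2, h_cells, w_cells)
                self.mean = np.zeros(shape)
                self.var = np.full(shape, INIT_VAR)
                self.age = np.zeros(shape)

            def update(self, obs_mean, obs_var):
                """obs_mean/obs_var: per-cell statistics of the current frame,
                already warped into the model's coordinates (motion compensation)."""
                d2 = (obs_mean - self.mean) ** 2
                for m in (0, 1):                       # apparent first, then candidate
                    match = d2[m] < VAR_THRESH * self.var[m]
                    alpha = 1.0 / (self.age[m] + 1.0)  # young cells adapt fast, old slowly
                    self.mean[m][match] += (alpha * (obs_mean - self.mean[m]))[match]
                    self.var[m][match] += (alpha * (d2[m] + obs_var - self.var[m]))[match]
                    self.age[m][match] = np.minimum(self.age[m][match] + 1.0, MAX_AGE)
                    d2[1][match] = np.inf              # a matched cell skips the other mode
                # hand-over: a candidate supported longer than the apparent model replaces it
                swap = self.age[1] > self.age[0]
                for a in (self.mean, self.var, self.age):
                    a[0][swap] = a[1][swap]
                self.age[1][swap] = 0.0                # restart the candidate mode
                self.var[1][swap] = INIT_VAR
                # (re-initializing the candidate for cells matching neither mode is omitted)

    A pixel would then be declared foreground when it deviates too far from the apparent model's mean, and the attentional sampling of Chapter 3 restricts both this test and the update to the selected candidate pixels.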
    In experiments, the proposed framework demonstrates its efficiency and robustness through qualitative and quantitative comparison with the state of the art. With the first scheme, processing one frame takes only 4.8 ms without parallel processing. The second scheme adapts to rapidly changing scenes while maintaining performance and speed. With the third scheme, for the driving situation, successful results are shown in background modeling and moving object detection in dashcam videos.

    Contents:
    1 Introduction
      1.1 Background
      1.2 Related works
      1.3 Contributions
      1.4 Contents of Thesis
    2 Problem Statements
      2.1 Background-centric approach for a fixed camera
      2.2 Problem statements for a moving camera
    3 Dual modeling with Attentional Sampling
      3.1 Dual-mode modeling for a moving camera
        3.1.1 Age model for adaptive learning rate
        3.1.2 Grid-based modeling
        3.1.3 Dual-mode kernel modeling
        3.1.4 Motion compensation by mixing models
      3.2 Dual-mode modeling with Attentional sampling
        3.2.1 Foreground probability map based on occurrence
        3.2.2 Sampling Map Generation
        3.2.3 Model update with sampling map
        3.2.4 Probabilistic Foreground Decision
      3.3 Benefits
    4 Situation-aware Background Learning
      4.1 Situation Variable Estimation
        4.1.1 Background Motion Estimation
        4.1.2 Foreground Motion Estimation
        4.1.3 Illumination Change Estimation
      4.2 Situation-Aware Background Learning
        4.2.1 Situation-Aware Warping of the Background Model
        4.2.2 Situation-Aware Update of the Background Model
      4.3 Foreground Decision
      4.4 Benefits
    5 Prior-based Attentional Update for dashcam video
      5.1 Camera Motion Estimation
      5.2 Road and Sky region estimation
      5.3 Background learning
      5.4 Foreground Result Combining
      5.5 Benefits
    6 Experiments
      6.1 Qualitative Comparisons
        6.1.1 Dual modeling with attentional sampling
        6.1.2 Situation-aware background learning
        6.1.3 Prior-based attentional update
      6.2 Quantitative Comparisons
        6.2.1 Dual modeling with attentional sampling
        6.2.2 Situation-aware background learning
        6.2.3 Prior-based attentional update (PBAU)
        6.2.4 Runtime evaluation
        6.2.5 Unified framework
      6.3 Application: combining with recognition algorithm
      6.4 Discussion
        6.4.1 Issues
        6.4.2 Strength
        6.4.3 Limitation
    7 Concluding remarks and Future works
    Bibliography
    Abstract (in Korean)

    Localization Uncertainty Estimation for Anchor-Free Object Detection

    Since many safety-critical systems, such as surgical robots and autonomous driving cars, operate in unstable environments with sensor noise and incomplete data, it is desirable for object detectors to take the confidence of localization prediction into account. Prior uncertainty estimation methods for anchor-based object detection have three limitations. 1) They model the uncertainty based on object properties with different characteristics, such as location (center point) and scale (width, height). 2) They model the box offset as a Gaussian distribution and the ground truth as a Dirac delta distribution, which leads to a model misspecification problem, because a Dirac delta distribution cannot be exactly represented as a Gaussian N(μ, Σ) for any μ and Σ. 3) Since anchor-based methods are sensitive to anchor hyper-parameters, the localization uncertainty modeling is also sensitive to these parameters. We therefore propose a new localization uncertainty estimation method, Gaussian-FCOS, for anchor-free object detection. Our method captures uncertainty in the four directions of box offsets (left, right, top, bottom), which have similar properties; this makes it possible to identify which direction is uncertain and to provide a quantitative value in the range [0, 1]. To this end, we design a new uncertainty loss, the negative power log-likelihood loss, which measures uncertainty by weighting the likelihood loss with the IoU and thereby alleviates the model misspecification problem. Experiments on the COCO dataset demonstrate that Gaussian-FCOS reduces false positives and finds more missed objects by mitigating over-confident scores with the estimated uncertainty. We hope Gaussian-FCOS serves as a crucial component for reliability-critical tasks.
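    One plausible reading of the negative power log-likelihood is that the per-direction Gaussian likelihood is raised to the power of the IoU, so the loss becomes an IoU-weighted negative log-likelihood. The sketch below illustrates that reading only; the paper's exact formulation is not given in the abstract, and the names and shapes are assumed here.

        import numpy as np

        def negative_power_log_likelihood(pred_offset, pred_sigma, gt_offset, iou):
            """IoU-weighted Gaussian NLL over the four box directions (l, r, t, b).

            pred_offset, gt_offset: (N, 4) left/right/top/bottom offsets
            pred_sigma:             (N, 4) predicted std-dev per direction
            iou:                    (N,)   IoU of predicted vs. ground-truth box
            """
            # Gaussian negative log-likelihood, computed per direction
            nll = 0.5 * np.log(2.0 * np.pi * pred_sigma ** 2) \
                + (gt_offset - pred_offset) ** 2 / (2.0 * pred_sigma ** 2)
            # Raising the likelihood to the power IoU scales the NLL by IoU:
            # -log(p ** iou) = -iou * log(p)
            return (iou[:, None] * nll).sum(axis=1).mean()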

    Occluded Pedestrian-Attribute Recognition for Video Sensors Using Group Sparsity

    Pedestrians are often occluded by other objects or people in footage from real-world vision sensors. These occlusions make pedestrian-attribute recognition (PAR) difficult; hence, occlusion handling for visual sensing is a key issue in PAR. To address this problem, we first formulate the identification of non-occluded frames as temporal attention based on the sparsity of a crowded video; in other words, a PAR model is guided not to attend to occluded frames. However, this approach cannot capture correlations between attributes when occlusion occurs. For example, “boots” and “shoe color” cannot be recognized simultaneously when the feet are invisible. To address this uncorrelated-attention issue, we propose a novel temporal-attention module based on group sparsity, applied across the attention weights of correlated attributes. Accordingly, physically adjacent pedestrian attributes are grouped, and the attention weights of a group are forced to focus on the same frames. Experimental results indicate that the proposed method achieves F1-scores 1.18% and 6.21% higher than the advanced baseline method on the occlusion samples of the DukeMTMC-VideoReID and MARS video-based PAR datasets, respectively.
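    The group-sparsity idea can be made concrete with an L2,1 (group-lasso) penalty on the attention weights: taking the L2 norm across a group's attributes at each frame and summing over frames drives a whole group's attention on or off a frame together. A minimal sketch under that reading; the grouping, shapes, and weight lam are illustrative assumptions.

        import numpy as np

        def group_sparsity_penalty(attn, groups, lam=0.1):
            """L2,1 penalty pushing attributes in a group to attend to the same frames.

            attn:   (A, T) attention weights, one row per attribute over T frames
            groups: list of index lists, e.g. [[0, 1], [2, 3]] for physically
                    adjacent attributes ("boots" and "shoe color" share the feet)
            """
            penalty = 0.0
            for g in groups:
                # L2 across the group's attributes at each frame, then L1 over frames:
                # zeroing a frame for one attribute pushes the whole group off that frame
                penalty += np.sqrt((attn[g] ** 2).sum(axis=0)).sum()
            return lam * penalty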

    Position-aware Location Regression Network for Temporal Video Grounding

    The key to successful grounding for video surveillance is understanding a semantic phrase corresponding to important actors and objects. Conventional methods either ignore comprehensive context for the phrase or require heavy computation for multiple phrases. To understand comprehensive context with only one semantic phrase, we propose the Position-aware Location Regression Network (PLRN), which exploits position-aware features of a query and a video. Specifically, PLRN first encodes both the video and the query using positional information of words and video segments. A semantic phrase feature is then extracted from the encoded query with attention. The semantic phrase feature and the encoded video are merged into a context-aware feature that reflects local and global contexts. Finally, PLRN predicts the start, end, center, and width values of the grounding boundary. Our experiments show that PLRN achieves competitive performance over existing methods with less computation time and memory.
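    The boundary head is straightforward to picture: the context-aware feature is regressed to four scalars, and the two parameterizations can be cross-checked, since the center should equal (start + end)/2 and the width should equal end − start. The linear head and consistency term below are hypothetical illustrations of that redundancy, not PLRN's actual layers.

        import numpy as np

        def predict_boundary(context_feature, w, b):
            """Regress (start, end, center, width) in [0, 1] from a context-aware feature.

            context_feature: (D,) merged video-query feature
            w, b:            (4, D) weights and (4,) biases of a linear head (hypothetical)
            """
            s, e, c, wd = 1.0 / (1.0 + np.exp(-(w @ context_feature + b)))  # sigmoid
            return s, e, c, wd

        def consistency_loss(s, e, c, wd):
            # both parameterizations describe one segment, so penalize disagreement
            return (c - (s + e) / 2.0) ** 2 + (wd - (e - s)) ** 2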

    Motion Interaction Field for Accident Detection in Traffic Surveillance Video

    This paper presents a novel method for modeling interaction among multiple moving objects to detect traffic accidents. The proposed interaction model is motivated by the motion of water waves responding to objects moving on a water surface. The shape of the water surface is modeled as a field of Gaussian kernels, referred to as the Motion Interaction Field (MIF). By exploiting the symmetry properties of the MIF, we detect and localize traffic accidents without solving complex vehicle tracking problems. Experimental results show that our method outperforms existing works in detecting and localizing traffic accidents.
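    The MIF construction admits a compact sketch: each moving object deposits a Gaussian kernel into a 2-D field, and a collision shows up as broken symmetry of the combined field around the interaction point. The kernel width and asymmetry score below are chosen for exposition and are not the paper's exact definitions.

        import numpy as np

        def motion_interaction_field(h, w, centers, sigma=15.0):
            """Sum of Gaussian kernels centered at moving-object positions."""
            ys, xs = np.mgrid[0:h, 0:w]
            field = np.zeros((h, w))
            for (cy, cx) in centers:
                field += np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
            return field

        def asymmetry_score(field, cy, cx, radius):
            """Compare the field with its 180-degree rotation about an interaction point;
            a one-sided disturbance (e.g., a collision) raises the score.
            Assumes the window lies fully inside the frame."""
            patch = field[cy - radius:cy + radius, cx - radius:cx + radius]
            return float(np.abs(patch - patch[::-1, ::-1]).mean())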

    CuCrO2 Nanoparticles Incorporated into PTAA as a Hole Transport Layer for 85 °C and Light Stabilities in Perovskite Solar Cells

    High-mobility inorganic CuCrO2 nanoparticles are used together with conventional poly(bis(4-phenyl)(2,5,6-trimethylphenyl)amine) (PTAA) as a hole transport layer (HTL) for perovskite solar cells to improve device performance and long-term stability. Although CuCrO2 nanoparticles can be readily synthesized by hydrothermal reaction, it is difficult to form a uniform HTL with CuCrO2 alone because the nanoparticles agglomerate severely. Here, CuCrO2 nanoparticles and PTAA are sequentially deposited on the perovskite by a simple spin-coating process, forming a uniform HTL with excellent coverage. Owing to the high-mobility CuCrO2 nanoparticles, the CuCrO2/PTAA HTL shows better carrier extraction and transport. A reduction in trap density is also observed in trap-filled limit voltage and capacitance analyses. The incorporation of stable CuCrO2 also improves device stability under heat and light: encapsulated perovskite solar cells with the CuCrO2/PTAA HTL retain over 90% of their efficiency after ~900 h of storage at 85 °C/85% relative humidity and under continuous 1-sun illumination at the maximum-power point.